Learning Trees and Rules with Set-Valued Features

Author

  • William W. Cohen
Abstract

In most learning systems, examples are represented as fixed-length "feature vectors", the components of which are either real numbers or nominal values. We propose an extension of the feature-vector representation that allows the value of a feature to be a set of strings; for instance, to represent a small white and black dog with the nominal features size and species and the set-valued feature color, one might use a feature vector with size=small, species=canis-familiaris and color={white,black}. Since we make no assumptions about the number of possible set elements, this extension of the traditional feature-vector representation is closely connected to Blum's "infinite attribute" representation. We argue that many decision tree and rule learning algorithms can be easily extended to set-valued features. We also show by example that many real-world learning problems can be efficiently and naturally represented with set-valued features; in particular, text categorization problems and problems that arise in propositionalizing first-order representations lend themselves to set-valued features.
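
The representation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the toy examples, the labels, and the helper names (`entropy`, `membership_gain`, `best_membership_test`) are hypothetical, and the sketch assumes that a tree or rule learner would test a set-valued feature with a membership condition such as `"black" in color`, scored here by ordinary information gain.

```python
from collections import Counter
from math import log2

# Hypothetical toy data: (feature vector, class label).
# "size" and "species" are nominal; "color" is set-valued.
examples = [
    ({"size": "small", "species": "canis-familiaris", "color": {"white", "black"}}, "yes"),
    ({"size": "large", "species": "canis-familiaris", "color": {"brown"}}, "no"),
    ({"size": "small", "species": "felis-catus", "color": {"black"}}, "yes"),
    ({"size": "large", "species": "felis-catus", "color": {"white", "brown"}}, "no"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def membership_gain(examples, feature, element):
    """Information gain of the assumed test `element in example[feature]`."""
    labels = [y for _, y in examples]
    inside = [y for x, y in examples if element in x[feature]]
    outside = [y for x, y in examples if element not in x[feature]]
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in (inside, outside)
        if part  # skip an empty branch
    )
    return entropy(labels) - remainder

def best_membership_test(examples, feature):
    """Pick the set element whose membership test has the highest gain.

    Candidates are only the elements that actually occur in the data,
    so no fixed vocabulary of possible set elements is needed.
    """
    candidates = sorted(set().union(*(x[feature] for x, _ in examples)))
    return max(candidates, key=lambda e: membership_gain(examples, feature, e))

if __name__ == "__main__":
    for element in ("black", "brown", "white"):
        print(element, membership_gain(examples, "color", element))
    print("best test element:", best_membership_test(examples, "color"))
```

Because candidate tests are drawn from the elements observed in the training data, the sketch never enumerates a universe of possible colors, which is one way to read the abstract's remark that no assumption is made about the number of possible set elements.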

Similar resources

A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining

Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...

A complex-valued neuro-fuzzy inference system and its learning mechanism

In this paper, we present a Complex-valued Neuro-Fuzzy Inference System (CNFIS) and develop its meta-cognitive learning algorithm. CNFIS has four layers: an input layer with m rules, a Gaussian layer with K rules, a normalization layer with K rules and an output layer with n rules. The rules in the Gaussian layer map the m-dimensional complex-valued input features to a K-dimensional real-valued s...

The Generation of Fuzzy Rules from Decision Trees

This paper introduces two methods of developing fuzzy rules, using decision trees, from data with continuous valued inputs and outputs. A key problem is how to deal with continuous outputs. Here output classes are created. A crisp decision tree may then be created using a set of fuzzy output classes allowing each training example to partially belong to the classes. Alternatively, a discrete set...

ATTRIBUTIONAL CALCULUS A Logic and Representation Language for Natural Induction

Attributional calculus ( ) is a typed logic system that combines elements of propositional logic, predicate calculus, and multiple-valued logic for the purpose of natural induction. By natural induction is meant a form of inductive learning that generates hypotheses in human-oriented forms, that is, forms that appear natural to people, and are easy to understand and relate to human knowledge. T...

Combining decision trees and transformation-based learning to correct transferred linguistic representations

We present a hybrid machine learning approach to correcting features in transferred linguistic representations in machine translation. The hybrid approach combines decision trees and transformation-based learning. Decision trees serve as a filter on the intractably large search space of possible interrelations among features. Transformation-based learning results in a simple set of ordered rule...

Publication date: 1996